Search CORE

1,162 research outputs found

Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification

Author: Anirudh Rushil
Thiagarajan Jayaraman J.
Publication venue
Publication date: 29/10/2018
Field of study

Using predictive models to identify patterns that can act as biomarkers for different neuropathoglogical conditions is becoming highly prevalent. In this paper, we consider the problem of Autism Spectrum Disorder (ASD) classification where previous work has shown that it can be beneficial to incorporate a wide variety of meta features, such as socio-cultural traits, into predictive modeling. A graph-based approach naturally suits these scenarios, where a contextual graph captures traits that characterize a population, while the specific brain activity patterns are utilized as a multivariate signal at the nodes. Graph neural networks have shown improvements in inferencing with graph-structured data. Though the underlying graph strongly dictates the overall performance, there exists no systematic way of choosing an appropriate graph in practice, thus making predictive models non-robust. To address this, we propose a bootstrapped version of graph convolutional neural networks (G-CNNs) that utilizes an ensemble of weakly trained G-CNNs, and reduce the sensitivity of models on the choice of graph construction. We demonstrate its effectiveness on the challenging Autism Brain Imaging Data Exchange (ABIDE) dataset and show that our approach improves upon recently proposed graph-based neural networks. We also show that our method remains more robust to noisy graphs

arXiv.org e-Print Archive

A Generative Modeling Approach to Limited Channel ECG Classification

Author: Rajan Deepta
Thiagarajan Jayaraman J.
Publication venue
Publication date: 13/06/2018
Field of study

Processing temporal sequences is central to a variety of applications in health care, and in particular multi-channel Electrocardiogram (ECG) is a highly prevalent diagnostic modality that relies on robust sequence modeling. While Recurrent Neural Networks (RNNs) have led to significant advances in automated diagnosis with time-series data, they perform poorly when models are trained using a limited set of channels. A crucial limitation of existing solutions is that they rely solely on discriminative models, which tend to generalize poorly in such scenarios. In order to combat this limitation, we develop a generative modeling approach to limited channel ECG classification. This approach first uses a Seq2Seq model to implicitly generate the missing channel information, and then uses the latent representation to perform the actual supervisory task. This decoupling enables the use of unsupervised data and also provides highly robust metric spaces for subsequent discriminative learning. Our experiments with the Physionet dataset clearly evidence the effectiveness of our approach over standard RNNs in disease prediction

arXiv.org e-Print Archive

Understanding Behavior of Clinical Models under Domain Shifts

Author: Rajan Deepta
Sattigeri Prasanna
Thiagarajan Jayaraman J.
Publication venue
Publication date: 13/06/2019
Field of study

The hypothesis that computational models can be reliable enough to be adopted in prognosis and patient care is revolutionizing healthcare. Deep learning, in particular, has been a game changer in building predictive models, thus leading to community-wide data curation efforts. However, due to inherent variabilities in population characteristics and biological systems, these models are often biased to the training datasets. This can be limiting when models are deployed in new environments, when there are systematic domain shifts not known a priori. In this paper, we propose to emulate a large class of domain shifts, that can occur in clinical settings, with a given dataset, and argue that evaluating the behavior of predictive models in light of those shifts is an effective way to quantify their reliability. More specifically, we develop an approach for building realistic scenarios, based on analysis of \textit{disease landscapes} in multi-label classification. Using the openly available MIMIC-III EHR dataset for phenotyping, for the first time, our work sheds light into data regimes where deep clinical models can fail to generalize. This work emphasizes the need for novel validation mechanisms driven by real-world domain shifts in AI for healthcare

arXiv.org e-Print Archive

A Regularized Attention Mechanism for Graph Attention Networks

Author: Shanthamallu Uday Shankar
Spanias Andreas
Thiagarajan Jayaraman J.
Publication venue
Publication date: 10/02/2020
Field of study

Machine learning models that can exploit the inherent structure in data have gained prominence. In particular, there is a surge in deep learning solutions for graph-structured data, due to its wide-spread applicability in several fields. Graph attention networks (GAT), a recent addition to the broad class of feature learning models in graphs, utilizes the attention mechanism to efficiently learn continuous vector representations for semi-supervised learning problems. In this paper, we perform a detailed analysis of GAT models, and present interesting insights into their behavior. In particular, we show that the models are vulnerable to heterogeneous rogue nodes and hence propose novel regularization strategies to improve the robustness of GAT models. Using benchmark datasets, we demonstrate performance improvements on semi-supervised learning, using the proposed robust variant of GAT

arXiv.org e-Print Archive

Learning Stable Multilevel Dictionaries for Sparse Representations

Author: Ramamurthy Karthikeyan Natesan
Spanias Andreas
Thiagarajan Jayaraman J.
Publication venue
Publication date: 25/09/2013
Field of study

Sparse representations using learned dictionaries are being increasingly used with success in several data processing and machine learning applications. The availability of abundant training data necessitates the development of efficient, robust and provably good dictionary learning algorithms. Algorithmic stability and generalization are desirable characteristics for dictionary learning algorithms that aim to build global dictionaries which can efficiently model any test data similar to the training samples. In this paper, we propose an algorithm to learn dictionaries for sparse representations from large scale data, and prove that the proposed learning algorithm is stable and generalizable asymptotically. The algorithm employs a 1-D subspace clustering procedure, the K-hyperline clustering, in order to learn a hierarchical dictionary with multiple levels. We also propose an information-theoretic scheme to estimate the number of atoms needed in each level of learning and develop an ensemble approach to learn robust dictionaries. Using the proposed dictionaries, the sparse code for novel test data can be computed using a low-complexity pursuit procedure. We demonstrate the stability and generalization characteristics of the proposed algorithm using simulations. We also evaluate the utility of the multilevel dictionaries in compressed recovery and subspace learning applications

arXiv.org e-Print Archive

Attend and Diagnose: Clinical Time Series Analysis using Attention Models

Author: Rajan Deepta
Song Huan
Spanias Andreas
Thiagarajan Jayaraman J.
Publication venue
Publication date: 19/11/2017
Field of study

With widespread adoption of electronic health records, there is an increased emphasis for predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long Short-Term Memory (LSTM) units, deep neural networks have achieved state-of-the-art results in several clinical prediction tasks. Despite the success of RNNs, its sequential nature prohibits parallelized computing, thus making it inefficient particularly when processing long sequences. Recently, architectures which are based solely on attention mechanisms have shown remarkable success in transduction tasks in NLP, while being computationally superior. In this paper, for the first time, we utilize attention models for clinical time-series modeling, thereby dispensing recurrence entirely. We develop the \textit{SAnD} (Simply Attend and Diagnose) architecture, which employs a masked, self-attention mechanism, and uses positional encoding and dense interpolation strategies for incorporating temporal order. Furthermore, we develop a multi-task variant of \textit{SAnD} to jointly infer models with multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we demonstrate that the proposed approach achieves state-of-the-art performance in all tasks, outperforming LSTM models and classical baselines with hand-engineered features.Comment: AAAI 201

arXiv.org e-Print Archive

An Unsupervised Approach to Solving Inverse Problems using Generative Adversarial Networks

Author: Anirudh Rushil
Bremer Timo
Kailkhura Bhavya
Thiagarajan Jayaraman J.
Publication venue
Publication date: 04/06/2018
Field of study

Solving inverse problems continues to be a challenge in a wide array of applications ranging from deblurring, image inpainting, source separation etc. Most existing techniques solve such inverse problems by either explicitly or implicitly finding the inverse of the model. The former class of techniques require explicit knowledge of the measurement process which can be unrealistic, and rely on strong analytical regularizers to constrain the solution space, which often do not generalize well. The latter approaches have had remarkable success in part due to deep learning, but require a large collection of source-observation pairs, which can be prohibitively expensive. In this paper, we propose an unsupervised technique to solve inverse problems with generative adversarial networks (GANs). Using a pre-trained GAN in the space of source signals, we show that one can reliably recover solutions to under determined problems in a `blind' fashion, i.e., without knowledge of the measurement process. We solve this by making successive estimates on the model and the solution in an iterative fashion. We show promising results in three challenging applications -- blind source separation, image deblurring, and recovering an image from its edge map, and perform better than several baselines

arXiv.org e-Print Archive

Calibrating Healthcare AI: Towards Reliable and Interpretable Deep Predictive Models

Author: Rajan Deepta
Sattigeri Prasanna
Thiagarajan Jayaraman J.
Venkatesh Bindya
Publication venue
Publication date: 27/04/2020
Field of study

The wide-spread adoption of representation learning technologies in clinical decision making strongly emphasizes the need for characterizing model reliability and enabling rigorous introspection of model behavior. While the former need is often addressed by incorporating uncertainty quantification strategies, the latter challenge is addressed using a broad class of interpretability techniques. In this paper, we argue that these two objectives are not necessarily disparate and propose to utilize prediction calibration to meet both objectives. More specifically, our approach is comprised of a calibration-driven learning method, which is also used to design an interpretability technique based on counterfactual reasoning. Furthermore, we introduce \textit{reliability plots}, a holistic evaluation mechanism for model reliability. Using a lesion classification problem with dermoscopy images, we demonstrate the effectiveness of our approach and infer interesting insights about the model behavior

arXiv.org e-Print Archive

Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification

Author: Karargyris Alexandros
Kashyap Satyananda
Rajan Deepta
Thiagarajan Jayaraman J.
Publication venue
Publication date: 10/02/2021
Field of study

Automated diagnostic assistants in healthcare necessitate accurate AI models that can be trained with limited labeled data, can cope with severe class imbalances and can support simultaneous prediction of multiple disease conditions. To this end, we present a deep learning framework that utilizes a number of key components to enable robust modeling in such challenging scenarios. Using an important use-case in chest X-ray classification, we provide several key insights on the effective use of data augmentation, self-training via distillation and confidence tempering for small data learning in medical imaging. Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting

arXiv.org e-Print Archive

Beyond L2-Loss Functions for Learning Sparse Models

Author: Aravkin Aleksandr Y.
Ramamurthy Karthikeyan Natesan
Thiagarajan Jayaraman J.
Publication venue
Publication date: 26/03/2014
Field of study

Incorporating sparsity priors in learning tasks can give rise to simple, and interpretable models for complex high dimensional data. Sparse models have found widespread use in structure discovery, recovering data from corruptions, and a variety of large scale unsupervised and supervised learning problems. Assuming the availability of sufficient data, these methods infer dictionaries for sparse representations by optimizing for high-fidelity reconstruction. In most scenarios, the reconstruction quality is measured using the squared Euclidean distance, and efficient algorithms have been developed for both batch and online learning cases. However, new application domains motivate looking beyond conventional loss functions. For example, robust loss functions such as

\ell_1

and Huber are useful in learning outlier-resilient models, and the quantile loss is beneficial in discovering structures that are the representative of a particular quantile. These new applications motivate our work in generalizing sparse learning to a broad class of convex loss functions. In particular, we consider the class of piecewise linear quadratic (PLQ) cost functions that includes Huber, as well as

\ell_1

, quantile, Vapnik, hinge loss, and smoothed variants of these penalties. We propose an algorithm to learn dictionaries and obtain sparse codes when the data reconstruction fidelity is measured using any smooth PLQ cost function. We provide convergence guarantees for the proposed algorithm, and demonstrate the convergence behavior using empirical experiments. Furthermore, we present three case studies that require the use of PLQ cost functions: (i) robust image modeling, (ii) tag refinement for image annotation and retrieval and (iii) computing empirical confidence limits for subspace clustering.Comment: 10 pages, 6 figure

arXiv.org e-Print Archive